MEDB 5502, Module 14, review

Topics to be covered, 1 of 2

  • What you will learn
    • 01 Linear regression, analysis of variance
    • 02 Linear regression with multiple independent variables
    • 03 Analysis of covariance
    • 04 Multi-factor analysis of variance
    • 05 Dimension reduction
    • 06 Logistic regression
    • 07 Diagnostic tests

Topics to be covered, 2 of 2

  • What you will learn
    • 08 Survival analysis
    • 09 Meta-analysis
    • 10 Dark side of data science
    • 11 Hierarchical models
    • 12 Longitudinal data
    • 13 Bayesian statistics

Module 01, Review

  • Simple linear regression
  • One factor analysis of variance

Module 01, SPSS scatterplot

Module 01, SPSS boxplot

Module 01, SPSS calculation of R Square

Module 01, SPSS ANOVA table

Module 01, SPSS linear regression coefficients

Break #1

  • What you have learned
    • 01 Linear regression, analysis of variance
  • What’s coming next
    • 02 Linear regression with multiple independent variables

Module 02, Linear regression with multiple independent variables

  • Analysis of variance table
    • R-squared
    • Partial F tests
  • Stepwise regression
  • Interpretation
  • Collinearity
  • Mediation

Module 02, Checking assumptions

  • Non-normality
    • Q-Q plot of residuals
  • Lack of independence
    • Assessed qualitatively
  • Unequal variances, non-linearity
    • Residual scatterplot

Module 02, SPSS dialog box for the general linear model

Module 02, SPSS computation of R-squared

  • 10,548.480/15,079.017 = 0.70
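The ratio above is a model sum of squares divided by a total sum of squares. A minimal sketch of that arithmetic, assuming the two numbers come from the model and corrected total rows of the SPSS table (the variable names are illustrative, not SPSS output labels):

```python
# Sums of squares as shown on the slide.
# Assumption: the first is the model SS and the second is the corrected total SS.
ss_model = 10548.480
ss_total = 15079.017

# R-squared is the share of total variation explained by the model.
r_squared = ss_model / ss_total
print(f"R-squared = {r_squared:.3f}")
```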

Module 02, SPSS computation of change in R-squared

  • \(\Delta R^2=0.700-0.693=0.007\)

Module 02, SPSS computation of partial F-test
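One common way to write the partial F statistic is in terms of the R-squared values of the full and reduced models. The sketch below uses the two R-squared values from the previous slide (0.700 and 0.693). The number of added variables and the error degrees of freedom are hypothetical placeholders, because the slides do not state them.

```python
def partial_f(r2_full, r2_reduced, n_added, df_error_full):
    """Partial F statistic for the variables added to the reduced model.

    n_added is the number of extra variables in the full model and
    df_error_full is the residual degrees of freedom of the full model.
    """
    numerator = (r2_full - r2_reduced) / n_added
    denominator = (1 - r2_full) / df_error_full
    return numerator / denominator


# R-squared values from the slides; n_added=1 and df_error_full=100 are
# hypothetical values chosen only to make the example runnable.
f_stat = partial_f(0.700, 0.693, n_added=1, df_error_full=100)
print(f"Partial F = {f_stat:.2f}")
```

Compare the result to an F distribution with n_added and df_error_full degrees of freedom.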

Module 02, SPSS computation of full regression model

Module 02, SPSS computation of collinearity statistics

What is mediation?

  • “A situation when the relationship between a predictor variable and an outcome variable can be explained by their relationship to a third variable (the mediator)”
    • Andy Field, Section 11.4

An informal assessment of mediation

Module 02, SPSS Q-Q plot

Module 02, Scatterplot, 1 of 4

Module 02, Scatterplot, 2 of 4

Module 02, Scatterplot, 3 of 4

Module 02, Scatterplot, 4 of 4

Break #2

  • What you have learned
    • 02 Linear regression with multiple independent variables
  • What’s coming next
    • 03 Analysis of covariance

Module 03, Analysis of covariance

  • Confounding/covariate imbalance
  • Interpretation
  • Interactions

Module 03, Checking assumptions

  • Non-normality
    • Q-Q plot of residuals
  • Lack of independence
    • Assessed qualitatively
  • Unequal variances, non-linearity
    • Residual scatterplots

Module 03, SPSS calculation of unadjusted estimates

Module 03, SPSS calculation of adjusted estimates

Module 03, SPSS visualization, 1 of 2

Module 03, SPSS visualization, 2 of 2

Module 03, SPSS Q-Q plot

Module 03, SPSS scatterplot

Testing for an interaction

Break #3

  • What you have learned
    • 03 Analysis of covariance
  • What’s coming next
    • 04 Multi-factor analysis of variance

Module 04, Multi-factor analysis of variance

  • Tukey post hoc test
  • Interaction

Module 04, Checking assumptions

  • Non-normality
    • Q-Q plot of residuals
  • Lack of independence
    • Assessed qualitatively
  • Unequal variances
    • Boxplots

Module 04, SPSS crosstabulation

Module 04, SPSS analysis of variance table

Module 04, SPSS removing irrelevant rows

Module 04, SPSS parameter estimates

Module 04, SPSS Tukey test

Module 04, SPSS Q-Q plot

Module 04, SPSS scatterplot

Box plots of exercise data

Mean values for the interaction

Analysis of variance table for interaction model

Parameter estimates for the interaction model

Interaction plot, 1 of 2

Interaction plot, 2 of 2

When you can’t estimate an interaction

  • Special case, n=1
    • Only one observation for each combination of categories
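A short sketch of why n=1 is a problem, assuming a balanced two-factor design. A model with an interaction fits one mean per cell, so with a single observation per cell no degrees of freedom remain for error. The factor sizes below are hypothetical.

```python
def error_df_with_interaction(levels_a, levels_b, n_per_cell):
    """Residual degrees of freedom for a balanced two-factor model
    that includes the interaction (one fitted mean per cell)."""
    total_observations = levels_a * levels_b * n_per_cell
    fitted_cell_means = levels_a * levels_b
    return total_observations - fitted_cell_means


# Hypothetical design: 3 levels of one factor, 4 levels of the other.
df_n1 = error_df_with_interaction(3, 4, n_per_cell=1)  # nothing left for error
df_n2 = error_df_with_interaction(3, 4, n_per_cell=2)  # 12 left for error
print(df_n1, df_n2)
```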

Example, full moon study, 1 of 2

Example, full moon study, 2 of 2

Interaction between exercise program and hours spent exercising

Testing for interaction in analysis of covariance

Table with irrelevant rows removed

Parameter estimates

  • Intercept for prog=1, -8.997 + 2.216 = -6.781
  • Intercept for prog=2, 9.993 + 2.216 = 12.209
  • Intercept for prog=3, 2.216
  • Slope for prog=1, 10.409 + -2.956 = 7.453
  • Slope for prog=2, 9.83 + -2.956 = 6.874
  • Slope for prog=3, -2.956
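The arithmetic above adds each program's estimate to the estimate for the reference category. A minimal sketch, assuming prog=3 is the reference category and the slope is for hours spent exercising (the dictionary names are illustrative):

```python
# Parameter estimates copied from the slide.
# Assumption: prog=3 is the reference category, so its shifts are zero.
intercept_ref = 2.216
slope_ref = -2.956
intercept_shift = {1: -8.997, 2: 9.993, 3: 0.0}
slope_shift = {1: 10.409, 2: 9.83, 3: 0.0}

# Each program gets its own line: reference value plus that program's shift.
lines = {
    prog: (intercept_ref + intercept_shift[prog], slope_ref + slope_shift[prog])
    for prog in (1, 2, 3)
}

for prog, (intercept, slope) in lines.items():
    print(f"prog={prog}: intercept={intercept:.3f}, slope={slope:.3f}")
```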

Analysis of variance table

Table of means

Centered analysis

Weight loss at various conditions

  • hours = 2 (mean), effort = 30 (mean)
    • \(\hat Y\) = 10.005
  • hours = 4 (mean+2), effort = 30 (mean)
    • \(\hat Y\) = 10.005 + 2.291*2 = 14.587
  • hours = 2 (mean), effort = 50 (mean+20)
    • \(\hat Y\) = 10.005 + 0.707*20 = 24.145
  • hours = 4 (mean+2), effort = 50 (mean+20)
    • \(\hat Y\) = 10.005 + 2.291*2 + 0.707*20 + 0.393*2*20 = 44.447
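The four predictions above follow one formula in which both predictors are measured as distances from their means. A minimal sketch using the coefficients shown on the slide (the function and constant names are illustrative):

```python
# Coefficients read off the slide for the centered analysis.
B0 = 10.005            # predicted weight loss at mean hours and mean effort
B_HOURS = 2.291        # effect of one extra hour when effort is at its mean
B_EFFORT = 0.707       # effect of one extra unit of effort when hours is at its mean
B_INTERACTION = 0.393  # change in the hours effect per extra unit of effort


def predicted_weight_loss(hours_above_mean, effort_above_mean):
    """Prediction from the centered model with an hours-by-effort interaction."""
    return (B0
            + B_HOURS * hours_above_mean
            + B_EFFORT * effort_above_mean
            + B_INTERACTION * hours_above_mean * effort_above_mean)


for d_hours, d_effort in [(0, 0), (2, 0), (0, 20), (2, 20)]:
    y_hat = predicted_weight_loss(d_hours, d_effort)
    print(f"hours: mean+{d_hours}, effort: mean+{d_effort}, prediction: {y_hat:.3f}")
```

Because of the interaction term, the last prediction is larger than the sum of the two separate increases.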

Line plots of means for unbalanced data

Table of means

Table of frequencies and column percentages

Break #4

  • What you have learned
    • 04 Multi-factor analysis of variance
  • What’s coming next
    • 05 Dimension reduction

Module 05, Dimension reduction

  • Principal components analysis
    • Eigenvectors, eigenvalues
  • Factor analysis
    • Factor rotation

Correlation matrix, 1 of 3

Correlation matrix, 2 of 3

Correlation matrix, 3 of 3

Communalities

Eigenvalues
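When principal components are computed from a correlation matrix, each eigenvalue divided by the sum of all eigenvalues gives the proportion of total variance carried by that component. A small sketch with a hypothetical two-variable example, where the eigenvalues have a closed form and no linear algebra library is needed:

```python
def proportion_of_variance(eigenvalues):
    """Share of total variance attributed to each principal component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


# Hypothetical 2x2 correlation matrix [[1, r], [r, 1]].
# Its eigenvalues are 1 + r and 1 - r.
r = 0.6
eigenvalues = [1 + r, 1 - r]
proportions = proportion_of_variance(eigenvalues)
print([round(p, 2) for p in proportions])
```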

Scree plot

Component matrix

Boxplots of first four principal components

Scatterplot of first four principal components

R-squared using four principal components

R-squared using all 24 variables

Rotated factor pattern, 1 of 3

Rotated factor pattern, 2 of 3

Rotated factor pattern, 3 of 3

Break #5

  • What you have learned
    • 05 Dimension reduction
  • What’s coming next
    • 06 Logistic regression

Module 06, Precursors to logistic regression

  • Test of two proportions
  • Chi-square test of independence
  • Odds ratio versus relative risk

Module 06, Logistic regression

  • Linear on log odds scale
  • Assumptions
    • Independence
    • Linearity

Break #6

  • What you have learned
    • 06 Logistic regression
  • What’s coming next
    • 07 Diagnostic tests

Module 07, Diagnostic tests

  • Sensitivity, specificity
    • SpPin, SnNout
  • Positive/negative predictive value
  • Likelihood ratio
  • ROC curve
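The first three measures above can all be computed from the same two-by-two table of test result versus true disease status. The counts below are hypothetical and the function name is illustrative.

```python
def diagnostic_summary(tp, fn, fp, tn):
    """Common diagnostic test measures from a two-by-two table.

    tp/fn: diseased patients who test positive/negative.
    fp/tn: healthy patients who test positive/negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical study: 100 diseased and 200 healthy patients.
summary = diagnostic_summary(tp=90, fn=10, fp=30, tn=170)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity do not depend on how common the disease is in the sample, while the predictive values do.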

Break #7

  • What you have learned
    • 07 Diagnostic tests
  • What’s coming next
    • 08 Survival analysis

Module 08, Basic survival analysis

  • Censoring
  • Kaplan-Meier curve
  • Log rank test
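A sketch of how censoring enters the Kaplan-Meier estimate: censored patients count as at risk until the time they are censored, but they never cause a drop in the curve. The data below are hypothetical, and this plain-Python function is an illustration rather than a replacement for statistical software.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time for each patient.
    events: True if the event was observed, False if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at this time.
        while i < len(data) and data[i][0] == current_time:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((current_time, survival))
        # Events and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


# Hypothetical data: five patients, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, False, True, True, False])
for time, surv in curve:
    print(f"t={time}: S(t)={surv:.2f}")
```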

Module 08, Cox regression

  • Define hazard function
    • Increasing/decreasing/constant hazard
    • Hazard ratio
  • Assumptions
    • Independence
    • Non-informative censoring

Break #8

  • What you have learned
    • 08 Survival analysis
  • What’s coming next
    • 09 Meta-analysis

Module 09, Meta-analysis

  • Forest plot
  • Publication bias
    • Funnel plot
  • Heterogeneity
    • Cochran’s Q, I-squared
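Cochran's Q and I-squared can be sketched with inverse-variance weights. The example assumes a fixed-effect pooled estimate, and the effect sizes and variances are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return (pooled estimate, Cochran's Q, I-squared).

    Uses inverse-variance weights and a fixed-effect pooled estimate.
    """
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is the share of Q beyond what chance alone would produce.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical effect sizes from three studies with equal variances.
pooled, q, i_squared = heterogeneity([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(f"pooled = {pooled:.2f}, Q = {q:.2f}, I-squared = {i_squared:.0%}")
```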

Break #9

  • What you have learned
    • 09 Meta-analysis
  • What’s coming next
    • 10 Dark side of data science

Module 10, Dark side of data science

  • Empiricism
  • Reification
  • Bias in data science

Break #10

  • What you have learned
    • 10 Dark side of data science
  • What’s coming next
    • 11 Hierarchical models

Module 11, Hierarchical models

  • Clustered data
  • Between and within cluster variation
    • Intraclass correlation
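The intraclass correlation is the share of total variation that lies between clusters. A minimal sketch, assuming the two variance components have already been estimated (the numbers are hypothetical):

```python
def intraclass_correlation(var_between, var_within):
    """Proportion of total variance attributable to differences between clusters."""
    return var_between / (var_between + var_within)


# Hypothetical variance components from a hierarchical model.
icc = intraclass_correlation(var_between=4.0, var_within=12.0)
print(f"ICC = {icc:.2f}")
```

An intraclass correlation near zero means observations in the same cluster are no more alike than observations in different clusters.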

Module 11, Checking assumptions

  • Independence
    • Only between clusters
  • Normality
    • Within clusters
    • Between clusters

Break #11

  • What you have learned
    • 11 Hierarchical models
  • What’s coming next
    • 12 Longitudinal data

Module 12, Longitudinal data

  • Random intercepts model
  • Random slopes model

Module 12, Checking assumptions

  • Independence
    • Only between subjects
  • Normality
    • Residuals
    • Random intercepts/slopes
  • Linearity
    • Scatterplot of residuals

Break #12

  • What you have learned
    • 12 Longitudinal data
  • What’s coming next
    • 13 Bayesian statistics

Module 13, Bayesian statistics

  • Prior
    • Flat or non-informative prior
  • Likelihood
  • Posterior
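One simple illustration of prior, likelihood, and posterior is the beta-binomial model, where a Beta prior combined with binomial data gives a Beta posterior. The sketch below uses a flat Beta(1, 1) prior and hypothetical data. It illustrates the idea and is not necessarily the example used in Module 13.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures


# Flat prior: Beta(1, 1) is uniform on the interval from 0 to 1.
# Hypothetical data: 7 successes and 3 failures.
post_a, post_b = update_beta(1, 1, successes=7, failures=3)
posterior_mean = post_a / (post_a + post_b)
print(f"posterior: Beta({post_a}, {post_b}), mean = {posterior_mean:.3f}")
```

With a flat prior, the posterior mean (8/12) stays close to the observed proportion (7/10) and moves closer as the sample grows.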

Summary, 1 of 2

  • What you have learned
    • 01 Linear regression, analysis of variance
    • 02 Linear regression with multiple independent variables
    • 03 Analysis of covariance
    • 04 Multi-factor analysis of variance
    • 05 Dimension reduction
    • 06 Logistic regression
    • 07 Diagnostic tests

Summary, 2 of 2

  • What you have learned
    • 08 Survival analysis
    • 09 Meta-analysis
    • 10 Dark side of data science
    • 11 Hierarchical models
    • 12 Longitudinal data
    • 13 Bayesian statistics